AITopics | online linear regression

Self-normalized martingale inequalities lie at the heart of confidence ellipsoids for online least squares and, more broadly, many bandit and reinforcement-learning results. Yet existing vector and scalar results typically rely on bounded covariates and an explicit regularization matrix, producing bounds that are \emph{not scale-invariant}: although the self-normalized quantity is scale-invariant by definition, its standard upper bounds are not. We characterize when scale-invariant upper bounds on self-normalized martingales are possible. Without further assumptions, we prove that nontrivial scale-invariant bounds exist only in dimension $d=1$; moreover, in $d=1$ we obtain $O(\log T)$ scale-invariant self-normalized bounds without any assumptions on the covariates. In contrast, for $d>1$ we show that no nontrivial scale-invariant bound can hold in full generality. We then connect this dichotomy to \emph{doubly-uniform} regret in online linear regression (i.e., regret bounds that are simultaneously independent of the covariate scale and the comparator norm) and use it to resolve the open question of Gaillard, Gerchinovitz, Huard, and Stoltz, \emph{``Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss''} (ALT 2019): in $d=1$ we give an explicit algorithm with $O(\log T)$ doubly-uniform regret, whereas for $d>1$ sublinear doubly-uniform regret is impossible. Finally, under a natural \emph{smoothness} condition (bounded Radon--Nikodym derivatives of the conditional covariate laws with respect to a fixed base measure), we recover sublinear regret for $d>1$ without bounded covariates and derive a self-normalized concentration inequality free of the usual regularization penalties, yielding arguably a first natural scale-invariant bound for adaptive, non-i.i.d. vector martingales.

artificial intelligence, machine learning, martingale, (19 more...)

arXiv.org Machine Learning

2605.01628

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

9d94bf4711fa459812437e5df5978551-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 01:03:27 GMT

algorithm, dynamic regret, sequence, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Add feedback

9d94bf4711fa459812437e5df5978551-Paper-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 01:03:23 GMT

algorithm, dynamic regret, sequence, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.32)

Add feedback

9d94bf4711fa459812437e5df5978551-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-17-2025, 06:52:11 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Add feedback

9d94bf4711fa459812437e5df5978551-Paper-Conference.pdf

Neural Information Processing SystemsAug-17-2025, 06:52:07 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.32)

Add feedback

Online Linear Regression in Dynamic Environments via Discounting

Jacobsen, Andrew, Cutkosky, Ashok

arXiv.org Machine LearningMay-29-2024

We develop algorithms for online linear regression which achieve optimal static and dynamic regret guarantees \emph{even in the complete absence of prior knowledge}. We present a novel analysis showing that a discounted variant of the Vovk-Azoury-Warmuth forecaster achieves dynamic regret of the form $R_{T}(\vec{u})\le O\left(d\log(T)\vee \sqrt{dP_{T}^{\gamma}(\vec{u})T}\right)$, where $P_{T}^{\gamma}(\vec{u})$ is a measure of variability of the comparator sequence, and show that the discount factor achieving this result can be learned on-the-fly. We show that this result is optimal by providing a matching lower bound. We also extend our results to \emph{strongly-adaptive} guarantees which hold over every sub-interval $[a,b]\subseteq[1,T]$ simultaneously.

algorithm, dynamic environment, online linear regression, (14 more...)

arXiv.org Machine Learning

2405.19175

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)

Add feedback

OLR-WA Online Regression with Weighted Average

Abu-Shaira, Mohammad, Speegle, Greg

arXiv.org Artificial IntelligenceJul-6-2023

Machine Learning requires a large amount of training data in order to build accurate models. Sometimes the data arrives over time, requiring significant storage space and recalculating the model to account for the new data. On-line learning addresses these issues by incrementally modifying the model as data is encountered, and then discarding the data. In this study we introduce a new online linear regression approach. Our approach combines newly arriving data with a previously existing model to create a new model. The introduced model, named OLR-WA (OnLine Regression with Weighted Average) uses user-defined weights to provide flexibility in the face of changing data to bias the results in favor of old or new data. We have conducted 2-D and 3-D experiments comparing OLR-WA to a static model using the entire data set. The results show that for consistent data, OLR-WA and the static batch model perform similarly and for varying data, the user can set the OLR-WA to adapt more quickly or to resist change.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.02804

Country:

North America > United States > Texas > McLennan County > Waco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting > Online (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Online Instrumental Variable Regression: Regret Analysis and Bandit Feedback

Della Vecchia, Riccardo, Basu, Debabrota

arXiv.org Artificial IntelligenceJun-26-2023

Endogeneity, i.e. the dependence between noise and covariates, is a common phenomenon in real data due to omitted variables, strategic behaviours, measurement errors etc. In contrast, the existing analyses of stochastic online linear regression with unbounded noise and linear bandits depend heavily on exogeneity, i.e. the independence between noise and covariates. Motivated by this gap, we study the over-and just-identified Instrumental Variable (IV) regression for stochastic online learning. IV regression and the Two-Stage Least Squares approach to it are widely deployed in economics and causal inference to identify the underlying model from an endogenous dataset. Thus, we propose to use an online variant of Two-Stage Least Squares approach, namely O2SLS, to tackle endogeneity in stochastic online learning. Our analysis shows that O2SLS achieves $\mathcal{O}\left(d_x d_z \log ^2 T\right)$ identification and $\tilde{\mathcal{O}}\left(\gamma \sqrt{d_x T}\right)$ oracle regret after $T$ interactions, where $d_x$ and $d_z$ are the dimensions of covariates and IVs, and $\gamma$ is the bias due to endogeneity. For $\gamma=0$, i.e. under exogeneity, O2SLS achieves $\mathcal{O}\left(d_x^2 \log ^2 T\right)$ oracle regret, which is of the same order as that of the stochastic online ridge. Then, we leverage O2SLS as an oracle to design OFUL-IV, a stochastic linear bandit algorithm that can tackle endogeneity and achieves $\widetilde{\mathcal{O}}\left(\sqrt{d_x d_z T}\right)$ regret. For different datasets with endogeneity, we experimentally show efficiencies of O2SLS and OFUL-IV in terms of regrets.

artificial intelligence, endogeneity, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.09357

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.14)
Asia (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Education > Educational Setting > Online (0.54)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Data Science (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Online Linear Regression and Its Application to Model-Based Reinforcement Learning

Neural Information Processing SystemsApr-6-2023, 14:48:46 GMT

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernalized) linearly parameterized dynamics. This result builds on Kearns and Singh's work that provides a provably efficient algorithm for finite state MDPs. Our approach is not restricted to the linear setting, and is applicable to other classes of continuous MDPs.

application, model-based reinforcement learning, online linear regression, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

Zhao, Heyang, Zhou, Dongruo, He, Jiafan, Gu, Quanquan

arXiv.org Artificial IntelligenceMar-27-2023

We study the problem of online generalized linear regression in the stochastic setting, where the label is generated from a generalized linear model with possibly unbounded additive noise. We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise. More specifically, for $\sigma$-sub-Gaussian label noise, our analysis provides a regret upper bound of $O(\sigma^2 d \log T) + o(\log T)$, where $d$ is the dimension of the input vector, $T$ is the total number of rounds. We also prove a $\Omega(\sigma^2d\log(T/d))$ lower bound for stochastic online linear regression, which indicates that our upper bound is nearly optimal. In addition, we extend our analysis to a more refined Bernstein noise condition. As an application, we study generalized linear bandits with heteroscedastic noise and propose an algorithm based on FTRL to achieve the first variance-aware regret bound.

artificial intelligence, linear regression, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2202.13603

Country: